[CIR][ThroughMLIR] Lower simple SwitchOp. #1871
Open
terapines-osc-cir wants to merge 2,698 commits into llvm:main from Terapines:cir-switch-2025-0828
+168,804
−44,698
Conversation
This adds an explicit alignment to load instructions where needed in order to make CIR load alignment consistent with classic codegen. As with aligned stores, I have updated tests that were failing with wildcard checks for cir.load, except where alignment was being specifically checked. Where tests failed because of changed alignment, I verified that the new alignment matches what is produced by classic codegen. The new test for proper alignment behavior is align-load.c.
…valuate unconditionally (llvm#1642) This came up during the review of llvm/llvm-project#138156. During codegen we check whether the LHS and RHS of a conditional operator are cheap enough to evaluate unconditionally. Unlike classic codegen, we still emit `TernaryOp` instead of `SelectOp` and defer that optimization to cir-simplify. This patch changes codegen to directly emit `SelectOp` for `cond ? constant : constant` expressions.
The verification used to require a runtime type query of `ConditionOp` which is not available outside the interface package; whereas the dialect package depends on the interface package, causing a circular dependency. This commit adds a singleton interface for `ConditionOp` to break off the circular dependency, and replaces the runtime type query using the newly added interface. Tested locally with a build with shared library enabled and verified that the build succeeds.
…m#1643) Backport improvements in ShiftOp for vectors from llvm/llvm-project#141111.
The transformation functions are all named `transferToXXXOp`. Are those typos? Co-authored-by: Yue Huang <[email protected]>
…llvm#1645) - Uses the `getI<bitwidth>IntegerAttr` builder method instead of explicit attribute and type creation. - Adds a few `getAlignmentAttr` helper functions to build the `mlir::IntegerAttr` representing alignment. - Removes duplicate type parameters that are inferred from `mlir::IntegerAttr`.
…lvm#1646) Currently, the following code snippet crashes during flattening, before lowering to llvm: ``` struct S { int a, b; }; void foo() { try { S s{1, 2}; } catch (...) { } } ``` Command to reproduce: ``` clang tmp.cpp -Xclang -fclangir -Xclang -emit-cir-flat -S -o - ``` The crash happens when flattening a TryOp with an empty catch region and [building catchers](https://github.com/llvm/clangir/blob/791c327da623e4cb1c193422f4b7a555f572b70a/clang/lib/CIR/Dialect/Transforms/FlattenCFG.cpp#L423). Something like: ``` "cir.try"() ({ }, { }) : () -> () ``` the crash happens at [`tryOp.isCatchAllOnly()`](https://github.com/llvm/clangir/blob/791c327da623e4cb1c193422f4b7a555f572b70a/clang/lib/CIR/Dialect/Transforms/FlattenCFG.cpp#L441C39-L441C61) to be specific, because the catch types attribute list is empty. The fix is simple - adding a check for an empty/non-existent catch region before building the catch clauses. This PR adds this fix and one test. **Side-note:** This enables `push_back` for `std::vector` to be lowered to llvm, for example: ``` #include <vector> void foo() { std::vector<int> v; v.push_back(1); } ```
The current implementation incorrectly uses `mlir::IntegerAttr::get` with the unsupported type `cir::IntType`, which is not compatible and was never tested. As discussed in PR llvm#1645, this is to be marked as NYI until a proper implementation is provided.
Backport the calculation of maskbits in the lowering from `N - 1` to `NextPowerOf2(numElements - 1) - 1`, similar to Clang CG. Backport from [#141411](llvm/llvm-project#141411)
…lvm#1651) We used to insert a continue block at the end of a flattened ternary op that contained only a branch to the remaining block. This patch removes that continue block and makes the true/false blocks jump directly to the remaining ops. With this patch, CIR now generates exactly the same LLVM IR as the original codegen.
This PR fixes [Issue#1647](llvm#1647). It just takes the implementation from [`emitRethrow`](https://github.com/llvm/clangir/blob/105d898b9898d224f0baca4b161a84bdcf817617/clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp#L2273C1-L2276C77) and extends the same logic to `emitThrow`. The only nitpick about the fix is the same as before: we have this [redundant ScopeOp](https://github.com/llvm/clangir/blob/105d898b9898d224f0baca4b161a84bdcf817617/clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp#L2298C1-L2303C37) which acts as a placeholder, so there are some redundant yield blocks in some cases. Aside from that, I believe this fix is okay for now. I have added the tests from the issue to confirm everything works as intended. cc: @mmha, @bcardosolopes.
We already have constraints in CIROps to make sure that the operands and result type are vectors - [VecCmpOp](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/include/clang/CIR/Dialect/IR/CIROps.td#L3237), - [VecTernaryOp](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/include/clang/CIR/Dialect/IR/CIROps.td#L3269-L3272)
The LLVM dialect now has a ptrmask intrinsic; use it instead of the manual computation. Also fix the bitwidth of the generated mask in ABIInfoImpl.
Backport support global initialization for ComplexType from (llvm/llvm-project#141369)
…lvm#1666) This adds the missing printing of `dso_local` to FuncOp. The `dsolocal` attribute was renamed in both `FuncOp` and `GlobalOp` to align with LLVM naming.
Implement atan2 intrinsic as part of llvm#1192
…llvm#1660) Fixes llvm#1405. As far as I understand it, eraseIfSafe should intuitively check whether all memref.load/store ops that obtain their offsets from the memref.reinterpret_cast ops in the eraseList have been created. If so, the operations in the eraseList are erased; otherwise they are kept until all cir.load/store ops relying on them are lowered. One challenge here is that we can't actually do this by checking the uses of the memref.reinterpret_cast operations, as their results aren't actually used by the created memref.load/store ops (the base alloca result found via findBaseAndIndices is used instead). Because of this, the base alloca result is passed as the newAddr Value to eraseIfSafe in the [cir.load](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CIR/Lowering/ThroughMLIR/LowerCIRToMLIR.cpp#L236C5-L242C6)/[cir.store](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CIR/Lowering/ThroughMLIR/LowerCIRToMLIR.cpp#L266C1-L271C6) lowerings. Currently the eraseIfSafe function counts all memref.load/store values that use this base address: https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CIR/Lowering/ThroughMLIR/LowerCIRToMLIR.cpp#L215-L218 The problem is that this also counts memref.load/store ops that load from or store to the base address but don't use the memref.reinterpret_cast ops to obtain their offsets. Because of this, the lowering fails when multiple stores/loads to/from the same array are performed in the original C code, as in the example from the issue: for the later memref.load/store ops, newUsedNum is larger than oldUsedNum (which counts only the uses of the cir.ptr_stride op), so the memref.reinterpret_cast ops are not removed. This PR contains a first attempt to fix this (i.e. only count the memref.load/store ops that obtain their offsets from the memref.reinterpret_cast ops in the eraseList).
I only count memref.load/store ops if the first offset value corresponds to the offset value in the last memref.reinterpret_cast. Limitations of this PR: this fixes the indirect lowering of the example in issue llvm#1405 and also works for other tests I made where multiple stores/loads to/from the same array are performed, but it assumes two things: 1. The cir.const used as the stride in the cir.ptr_stride is not reused in other cir.ptr_stride ops. 2. Only the last cir.ptr_stride can have multiple uses (for multidimensional arrays). Both of these assumptions seem to hold for the C code I tested (for the translation of accesses to C/C++ arrays to CIR ops), but the eraseIfSafe function might need to be changed/further improved in the future to support cases where those assumptions fail. For example, if an optimization is run on CIR where cir.const ops with the same value are reused for different cir.ptr_stride ops, the indirect lowering would still fail. Or if in a multidimensional array a subarray is accessed, e.g. ```c int arr[3][4]; int *row = arr[1]; ``` (Note: I'm pretty sure that for this it isn't sufficient to just extend the function to check whether all offset values correspond to the offset values in all memref.reinterpret_cast ops; we would probably need to check separately for each memref.reinterpret_cast whether it can be removed, instead of removing all or none in the eraseList.) While debugging issue llvm#1405 I noticed a few things that I think could be improved in the canonical ForOp lowering: 1. There is one edge case where the forOp should not be marked as canonical in my opinion: ```c int i; for (i = 0; i < 100; i++); i += 10; ``` (With the current lowering this for is marked canonical, but since i is replaced by the induction variable of the scf.for op and the actual memory representing i is not updated, i has a wrong value after the for. This is avoided when we lower this for as a non-canonical for.) 2.
I think we can directly replace the loads of the CIR IV with the scf IV and not create the dummy `arith.add IV, 0` op (I think this might be a relic from a previous MLIR version where replaceOp only worked with operations, not values). This makes the IR more readable and easier to understand. If I'm missing something here and the `arith.add IV, 0` has a purpose I'm not seeing, let me know. 3. When implementing the change in 1, we know that in a canonical for the induction variable is defined inside the for and is only valid there. Because of this, and since we replace the loads of the CIR IV with the scf IV, we can remove the unnecessary alloca and store ops created for the CIR IV. (These changes only show up in a non-optimized binary and aren't relevant when running clang with optimizations, but I still think they improve the readability and understandability of the core IR.) I also noticed that we are currently only running the SCFPreparePass when we are printing the result of the CIR-to-core-dialect translation: https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CIR/CodeGen/CIRPasses.cpp#L84-L85 Because of this, compiling to an object file (or LLVM IR) with the indirect lowering path fails if the code contains a canonical for. I suggest always running this pass when we're going through MLIR. ## Passing through is_nontemporal in load/store lowerings: Since the corresponding memref ops also have this attribute, it's basically just passing through a boolean (and doesn't need any special handling, I think). Even though there is probably no practical application now, I think this might avoid bugs/confusion in the future. If there is any reason not to do this, let me know. I also added a new test case for arrays, adjusted the canonical forOp test to reflect the changes made, combined the non-canonical forOp tests into one file, and added a test case for the edge case described before.
(Note: if I find the time, I will try to run the SingleSource test suite with the throughMLIR lowering next week to get a better idea of where we are with this pipeline. In general I agree with everything discussed in issue llvm#1219, but I think we can probably already add more support for arrays (and maybe pointers) with the existing MLIR core constructs.)
) This PR introduces [`TryMarkNoThrow`](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CodeGen/CodeGenFunction.cpp#L1394). [`isInterposable`](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/lib/CodeGen/CodeGenFunction.cpp#L1397C10-L1397C26) isn't fully implemented and I'm not quite sure we need it? Anyway, I have introduced a missing feature `getSemanticInterposition` relevant for its completion. I have also updated an old test -- [`foo()`](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/clang/test/CIR/CodeGen/try-catch-dtors.cpp#L313) should be marked as unwind/nothrow. I have compared with the original CodeGen and attached the llvm output for verification. One concern I have is if the cases I have to mimic [`mayThrow`](https://github.com/llvm/clangir/blob/6e5fa09550c98f84d017873ed3e5667fd5fd909c/llvm/lib/IR/Instruction.cpp#L1158) from the OG are enough, please let me know your thoughts.
This PR adds support for the `-fdump-record-layouts` flag. It enables printing both the `CIRGenRecordLayout` and the `ASTRecordLayout`, similar to what is done in CodeGen.
Backport support for Complex value initialization from the empty InitList. Backported from llvm/llvm-project#143192
) Currently we can't handle continues nested under `IfOp`, because if we replace it with a yield, then it only breaks out of that `if`-statement, rather than continuing the whole loop. Perhaps that should be done by changing the whole structure of the while loop. Co-authored-by: Yue Huang <[email protected]>
…llvm#1670) Backport the VecShuffleOp verifier to catch invalid index Implemented in llvm/llvm-project#143262
…llvm#1673) When we process a completed Enum type, we were checking to see if the completed type was in the type cache and clearing the cache if the completed and converted underlying type for the enum doesn't pass an `isInteger(32)` check. Unfortunately, this checks to see if the type is the MLIR builtin 32-bit integer type, whereas it will always be a CIR integer type, so the check always fails. I don't believe there can ever be a case where the forward declared type for the enum doesn't match the completed type, so we should never need to clear the cache. This change replaces the previous check with an assert that compares the actual completed type to the cached type.
…vm#1672) This removes unnecessary parentheses from the assembly format of BaseClassAddrOp, DerivedClassAddrOp, BaseDataMemberOp, DerivedDataMemberOp, BaseMethodOp, and DerivedMethodOp to bring them into conformance with the CIR ASM Style Guide. There is no functional change beyond the assembly format change.
- Replace std::map with llvm::StringMap and std::vector with llvm::SmallVector for improved performance. - Preserve the behavior - Remove unused headers
This PR implements some missing blocks that effectively allow us to launch kernels from the host. All of the tests stated in this [commit](llvm@69f2099) are now resolved. I spent half a day figuring out the following: I tried experimenting with performing host compilation (`-fcuda-is-device`) with the target triple `nvptx64-nvidia-cuda`, but was getting a module verification error that, to keep it simple, looked like: `error: 'cir.call' op calling convention mismatch: expected ptx_kernel, but provided c`. I thought that was expected given that we're essentially using the device to compile on the host, which doesn't make a lot of sense, until I tried to replicate the same in OG and didn't really run into any problem in that regard. Are the calling conventions enforced in CIR much stricter than in OG? Or is that simply a bug in OG?
Fixes llvm#1818 - Implement createVecCompare, getCIRIntOrFloatBitWidth, and getVectorFCmpIR helpers for VecCmp op creation. - Add clang/test/CIR/CodeGen/builtin-fcmp-sse.c test. In OG there is a sext from bool to int before casting to a float vector, since fcmp's result in LLVM IR is boolean-like, while VecCmpOp in CIR returns an int in the form of 0 or -1. There is also a boolean `shouldInvert` in CIR, since CIR doesn't contain optimized unordered comparisons; for example, OLE is the inverse predicate of UGT, so if we need UGT, we pass in OLE with `shouldInvert = true`.
Backport using ArrayOf constraints from the upstream and the test file for invalid type info
This PR adds lowering of `BlockAddressOp`. It uses two maps, `blockInfoToTagOp` and `unresolvedBlockAddressOp`, to defer matching `mlir::LLVM::BlockAddressOp` to its corresponding `mlir::LLVM::BlockTagOp` in cases where the matching label has not yet been emitted. If the `BlockTagOp` is not emitted, a placeholder value `std::numeric_limits<uint32_t>::max()` is used, which will later be resolved in `resolveBlockAddressOp`. Support for indirect goto and label differences will be added in a future PR.
This PR adds support for __builtin_ia32_cmpnltps/cmpnltpd. Depends on llvm#1893.
…as (llvm#1922) The format now has the following assembly form: `$kind $src : type($src) -> type($result) attr-dict` This unifies CIR operation formats by removing unnecessary parentheses and using `->` consistently to denote result types.
This also fixes GlobalOp roundtrip.
This allows generation of the CIR tablegen targets without going through ``` $(ninja -C build_pilot -t targets all | grep IncGen | sed 's/:.*//') ```
Backport DesignatedInitUpdateExpr for AggregateExpr from the upstream
Backport AtomicExpr for ComplexType from the upstream
Backport CXX new for ComplexType with init from the upstream
) This PR fixes an error I found while working on `cir.indirectbr`. The issue occurs when a branching operator points to the entry block; LLVM's verifier does not allow this: https://github.com/llvm/clangir/blob/10f2ee11fa61bb1550819ed54a5b0e111d9243aa/mlir/lib/IR/Verifier.cpp#L205-L208 Previously, in `cir.label`, when building a block, we only checked whether the current block was non-empty. Now we also check whether we are in the entry block; if we are, a new block is created instead. This change also helps emit IR that is closer to the classic codegen behavior.
This backports some minor changes to DynamicCastInfoAttr that were suggested during the upstreaming review. Specifically, the parameter names are changed to snake_case and the assembly format uses `struct()`.
656da33 to 8892d4f
Please let me know when this is ready for review again
Sure, this is one of the PRs I am currently working on. Thanks for your patience 🤝
`clang/lib/CIR/Dialect/Transforms`.
The PR 1742 contains too many conflicts with the main branch.
Got it re-patched to the current main branch.